Python Data Analytics by 2023


Chapter 6 ■ pandas in Depth: Data Manipulation

In addition, after an aggregation the names of some columns may no longer be very meaningful.

It is often useful to add a prefix to the column name that describes the type of aggregation applied.

Adding a prefix, instead of completely replacing the name, helps you keep track of the source data from which the aggregate values derive. This matters when you apply a chain of transformations (a series or dataframe generated from another), because each result should keep some reference to its source data.

>>> means = frame.groupby('color').mean(numeric_only=True).add_prefix('mean_')
>>> means
       mean_price1  mean_price2
color
green        2.025        2.375
red          2.380        2.435
white        5.560        4.750
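As a self-contained sketch of why the prefix helps in a chain of transformations, the script below joins the prefixed means back onto the source rows. The contents of frame are an assumption here, chosen to be consistent with the output shown; the original definition lies outside this excerpt, and the variable merged is introduced only for illustration.

```python
import pandas as pd

# Assumed sample data: the real frame is defined earlier in the chapter,
# outside this excerpt. These values reproduce the means shown above.
frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
    'price2': [4.75, 4.12, 1.60, 0.75, 3.15],
})

# Aggregate per color and mark every resulting column as a mean
means = frame.groupby('color').mean(numeric_only=True).add_prefix('mean_')

# Join the aggregates back onto the source rows; thanks to the prefix,
# price1 and mean_price1 can sit side by side without a name clash
merged = frame.merge(means, left_on='color', right_index=True)
print(merged)
```

Without the prefix, the merge would have to disambiguate the duplicate column names with generic suffixes, which says nothing about how the values were computed.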

Functions on Groups

Although many methods have not been implemented specifically for GroupBy, they work correctly on data structures such as the series. You saw in the previous section how easy it is to get a series from a GroupBy object: specify the name of the column and then apply the method that performs the calculation. For example, you can calculate quantiles with the quantile() function.

>>> group = frame.groupby('color')
>>> group['price1'].quantile(0.6)
color
green    2.170
red      2.744
white    5.560
Name: price1, dtype: float64
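To see where a value such as 2.170 comes from, the following sketch recomputes the green quantile by hand. It assumes the same sample frame as the output above (the original definition is outside this excerpt) and relies on the default linear interpolation of quantile(); the names q60, green, and manual are illustrative only.

```python
import pandas as pd

# Assumed sample data, consistent with the quantiles shown above
frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
})

group = frame.groupby('color')
q60 = group['price1'].quantile(0.6)

# With only two values in a group, linear interpolation reduces to
# min + q * (max - min)
green = frame.loc[frame['color'] == 'green', 'price1']
manual = green.min() + 0.6 * (green.max() - green.min())

print(q60['green'], manual)
```

A group with a single value, such as white, simply returns that value for any quantile.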

You can also define your own aggregation functions. Define the function separately and then pass it as an argument to the agg() function. For example, you can calculate the range of the values of each group.

>>> def range(series):
...     return series.max() - series.min()
...
>>> group['price1'].agg(range)
color
green    1.45
red      3.64
white    0.00
Name: price1, dtype: float64
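The function in the session above is named range, which shadows Python's built-in range for the rest of that session. A minimal variant, assuming the same sample frame (defined outside this excerpt), uses the hypothetical name value_range instead and shows that an equivalent lambda gives the same result.

```python
import pandas as pd

# Assumed sample data, consistent with the ranges shown above
frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
})
group = frame.groupby('color')

def value_range(series):
    """Return the spread between the largest and smallest value."""
    return series.max() - series.min()

# Pass the function object itself to agg(); it is called once per group
ranges = group['price1'].agg(value_range)

# An anonymous function works the same way for one-off calculations
same = group['price1'].agg(lambda s: s.max() - s.min())

print(ranges)
```

A named function is usually preferable when the result feeds into later steps, because its name is what appears as the column label when several aggregations are combined.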

You can also apply several aggregate functions at the same time by passing the agg() function a list of the operations to perform; each one becomes a column of the result.

>>> group['price1'].agg(['mean','std',range])
        mean       std  range
color
green  2.025  1.025305   1.45
red    2.380  2.573869   3.64
white  5.560       NaN   0.00
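When the default column labels are not descriptive enough, agg() also accepts keyword arguments that set the output names directly. The sketch below assumes the same sample frame (defined outside this excerpt); the helper value_range and the label spread are hypothetical names chosen for illustration.

```python
import pandas as pd

# Assumed sample data, consistent with the table shown above
frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
})
group = frame.groupby('color')

def value_range(series):
    return series.max() - series.min()

# Each keyword becomes a column name; each value is the aggregation
# to apply, given either as a string or as a callable
stats = group['price1'].agg(mean='mean', std='std', spread=value_range)

print(stats)
```

The standard deviation for white is NaN because that group holds a single value, so the sample standard deviation is undefined.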
